Testing whether linear equations are causal: A free probability theory approach
We propose a method that infers whether linear relations between two
high-dimensional variables X and Y are due to a causal influence from X to Y or
from Y to X. The earlier proposed so-called Trace Method is extended to the
regime where the dimension of the observed variables exceeds the sample size.
Based on previous work, we postulate conditions that characterize a causal
relation between X and Y. Moreover, we describe a statistical test and argue
that both causal directions are typically rejected if there is a common cause.
A full theoretical analysis is presented for the deterministic case but our
approach seems to be valid for the noisy case, too, for which we additionally
present an approach based on a sparsity constraint. The discussed method yields
promising results for both simulated and real-world data.
Distinguishing cause from effect using observational data: methods and benchmarks
The discovery of causal relationships from purely observational data is a
fundamental problem in science. The most elementary form of such a causal
discovery problem is to decide whether X causes Y or, alternatively, Y causes
X, given joint observations of two variables X, Y. An example is to decide
whether altitude causes temperature, or vice versa, given only joint
measurements of both variables. Even under the simplifying assumptions of no
confounding, no feedback loops, and no selection bias, such bivariate causal
discovery problems are challenging. Nevertheless, several approaches for
addressing those problems have been proposed in recent years. We review two
families of such methods: Additive Noise Methods (ANM) and Information
Geometric Causal Inference (IGCI). We present the benchmark CauseEffectPairs
that consists of data for 100 different cause-effect pairs selected from 37
datasets from various domains (e.g., meteorology, biology, medicine,
engineering, economy, etc.) and motivate our decisions regarding the "ground
truth" causal directions of all pairs. We evaluate the performance of several
bivariate causal discovery methods on these real-world benchmark data and in
addition on artificially simulated data. Our empirical results on real-world
data indicate that certain methods are indeed able to distinguish cause from
effect using only purely observational data, although more benchmark data would
be needed to obtain statistically significant conclusions. One of the best
performing methods overall is the additive-noise method originally proposed by
Hoyer et al. (2009), which obtains an accuracy of 63 ± 10 % and an AUC of
0.74 ± 0.05 on the real-world benchmark. As the main theoretical contribution of
this work we prove the consistency of that method.
Comment: 101 pages, second revision submitted to Journal of Machine Learning
Researc
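The additive-noise idea mentioned in the abstract above can be illustrated with a short Python sketch: regress the putative effect on the putative cause, then check how strongly the residuals depend on the input, and prefer the direction with the weaker dependence. This is only a minimal illustration under stated assumptions, not the implementation evaluated in the paper: the cubic polynomial regression, the biased HSIC estimate with a median-distance Gaussian kernel, the synthetic data, and all function names (`hsic`, `anm_score`, `infer_direction`) are choices made for this example.

```python
import numpy as np

def _gram(v):
    """Gaussian-kernel Gram matrix using the median squared distance as bandwidth."""
    d2 = (v[:, None] - v[None, :]) ** 2
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / sigma2)

def hsic(a, b):
    """Biased empirical HSIC; larger values indicate stronger dependence."""
    n = a.size
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(_gram(a) @ h @ _gram(b) @ h)) / n**2

def anm_score(cause, effect, degree=3):
    """Dependence between the putative cause and the regression residuals."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)

def infer_direction(x, y):
    """Prefer the direction whose residuals look more independent of the input."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    return "X->Y" if anm_score(x, y) < anm_score(y, x) else "Y->X"

# Synthetic example (assumed): X causes Y through a nonlinear function plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 500)
y = x + x**3 + rng.normal(0.0, 1.0, 500)
direction = infer_direction(x, y)
```

Comparing two raw HSIC scores is a simplification; a practical analysis would use a proper independence test and a more flexible regressor.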
On the links between sub-seasonal clustering of extreme precipitation and high discharge in Switzerland and Europe
River discharge is impacted by the sub-seasonal (weekly to monthly) temporal structure of precipitation. One example is the successive occurrence of extreme precipitation events over sub-seasonal timescales, referred to as temporal clustering. Its potential effects on discharge have received little attention. Here, we address this topic by analysing discharge observations following extreme precipitation events either clustered in time or occurring in isolation. We rely on two sets of precipitation and discharge data, one centred on Switzerland and the other over Europe. We identify "clustered" extreme precipitation events based on the previous occurrence of another extreme precipitation within a given time window. We find that clustered events are generally followed by a more prolonged discharge response with a larger amplitude. The probability of exceeding the 95th discharge percentile in 5 d following an extreme precipitation event is in particular up to twice as high for situations where another extreme precipitation event occurred in the preceding week compared to isolated extreme precipitation events. The influence of temporal clustering on discharge decreases as the clustering window increases; beyond 6–8 weeks the difference in discharge response with non-clustered events is negligible. Catchment area, streamflow regime and precipitation magnitude also modulate the response. The impact of clustering is generally smaller in snow-dominated and large catchments. Additionally, particularly persistent periods of high discharge tend to occur in conjunction with temporal clusters of precipitation extremes.
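The clustering criterion and the exceedance probability described in this abstract can be sketched in Python as follows. This is a toy illustration, not the authors' code: the precipitation threshold, the 7-day window, the discharge threshold, the hand-built series, and the function names `flag_clustered_events` and `exceedance_probability` are all assumptions made for the example.

```python
import numpy as np

def flag_clustered_events(precip, threshold, window):
    """Return (event_days, clustered) for a daily precipitation series.

    An event is a day with precip > threshold. It is flagged as clustered
    when another event occurred within the preceding `window` days.
    """
    precip = np.asarray(precip, dtype=float)
    event_days = np.flatnonzero(precip > threshold)
    clustered = np.zeros(event_days.size, dtype=bool)
    for i in range(1, event_days.size):
        # event_days is sorted, so the previous entry is the nearest earlier event
        clustered[i] = event_days[i] - event_days[i - 1] <= window
    return event_days, clustered

def exceedance_probability(discharge, event_days, q_threshold, lead=5):
    """Fraction of events followed by discharge > q_threshold within `lead` days."""
    discharge = np.asarray(discharge, dtype=float)
    hits = [
        np.any(discharge[d + 1 : d + 1 + lead] > q_threshold)
        for d in event_days
        if d + 1 < discharge.size
    ]
    return float(np.mean(hits)) if hits else float("nan")

# Hand-built toy series (assumed values): events on days 3, 6 and 25.
precip = np.zeros(40)
precip[[3, 6, 25]] = 50.0
discharge = np.ones(40)
discharge[10] = 10.0  # a single high-discharge day after the second event

days, clustered = flag_clustered_events(precip, threshold=20.0, window=7)
p_clustered = exceedance_probability(discharge, days[clustered], q_threshold=5.0)
p_isolated = exceedance_probability(discharge, days[~clustered], q_threshold=5.0)
```

In a real analysis the thresholds would be percentiles estimated from long observational records, and the two probabilities would be compared across many catchments rather than on one short series.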
Insights into the drivers and spatio-temporal trends of extreme Mediterranean wildfires with statistical deep-learning
Extreme wildfires are a significant cause of human death and biodiversity
destruction within countries that encompass the Mediterranean Basin. Recent
worrying trends in wildfire activity (i.e., occurrence and spread) suggest that
wildfires are likely to be highly impacted by climate change. In order to
facilitate appropriate risk mitigation, we must identify the main drivers of
extreme wildfires and assess their spatio-temporal trends, with a view to
understanding the impacts of global warming on fire activity. We analyse the
monthly burnt area due to wildfires over a region encompassing most of Europe
and the Mediterranean Basin from 2001 to 2020, and identify high fire activity
during this period in Algeria, Italy and Portugal. We build an extreme quantile
regression model with a high-dimensional predictor set describing
meteorological conditions, land cover usage, and orography. To model the
complex relationships between the predictor variables and wildfires, we use a
hybrid statistical deep-learning framework that can disentangle the effects of
vapour-pressure deficit (VPD), air temperature, and drought on wildfire
activity. Our results highlight that whilst VPD, air temperature, and drought
significantly affect wildfire occurrence, only VPD affects wildfire spread. To
gain insights into the effect of climate trends on wildfires in the near
future, we focus on August 2001 and perturb temperature according to its
observed trends (median over Europe: +0.04 K per year). We find that, on average
over Europe, these trends lead to a relative increase of 17.1% and 1.6% in
the expected frequency and severity, respectively, of wildfires in August 2001,
with spatially non-uniform changes in both aspects.
Hotspots and drivers of compound marine heatwaves and low net primary production extremes
Extreme events can severely impact marine organisms and ecosystems. Of particular concern are multivariate compound events, namely when conditions are simultaneously extreme for multiple ocean ecosystem stressors. In 2013–2015 for example, an extensive marine heatwave (MHW), known as the Blob, co-occurred locally with extremely low net primary productivity (NPPX) and negatively impacted marine life in the northeast Pacific. Yet, little is known about the characteristics and drivers of such multivariate compound MHW–NPPX events. Using five different satellite-derived net primary productivity (NPP) estimates and large-ensemble-simulation output of two widely used and comprehensive Earth system models, the Geophysical Fluid Dynamics Laboratory (GFDL) ESM2M-LE and Community Earth System Model version 2 (CESM2-LE), we assess the present-day distribution of compound MHW–NPPX events and investigate their potential drivers on the global scale. The satellite-based estimates and both models reveal hotspots of frequent compound events in the center of the equatorial Pacific and in the subtropical Indian Ocean, where their occurrence is at least 3 times higher (more than 10 d yr⁻¹) than if MHWs (temperature above the seasonally varying 90th-percentile threshold) and NPPX events (NPP below the seasonally varying 10th-percentile threshold) were to occur independently. However, the models show disparities in the northern high latitudes, where compound events are rare in the satellite-based estimates and GFDL ESM2M-LE (less than 3 d yr⁻¹) but relatively frequent in CESM2-LE. In the Southern Ocean south of 60° S, low agreement between the observation-based estimates makes it difficult to determine which of the two models better simulates MHW–NPPX events. The frequency patterns can be explained by the drivers of compound events, which vary among the two models and phytoplankton types.
In the low latitudes, MHWs are associated with enhanced nutrient limitation on phytoplankton growth, which results in frequent compound MHW–NPPX events in both models. In the high latitudes, NPPX events in GFDL ESM2M-LE are driven by enhanced light limitation, which rarely co-occurs with MHWs, resulting in rare compound events. In contrast, in CESM2-LE, NPPX events in the high latitudes are driven by reduced nutrient supply that often co-occurs with MHWs, moderates phytoplankton growth, and causes biomass to decrease. Compound MHW–NPPX events are associated with a relative shift towards larger phytoplankton in most regions, except in the eastern equatorial Pacific in both models, as well as in the northern high latitudes and between 35 and 50° S in CESM2-LE, where the models suggest a shift towards smaller phytoplankton, with potential repercussions on marine ecosystems. Overall, our analysis reveals that the likelihood of compound MHW–NPPX events is contingent on model representation of the factors limiting phytoplankton production. This identifies an important need for improved process understanding in Earth system models used for predicting and projecting compound MHW–NPPX events and their impacts.
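The independence baseline behind the "at least 3 times higher" statement can be worked out directly. With a 90th-percentile threshold for heatwaves and a 10th-percentile threshold for low productivity, independent extremes would coincide on 0.1 × 0.1 = 1% of days, or about 3.65 days per year, so three times that is roughly 11 days per year, in line with the "more than 10" days per year quoted in the abstract. The Python sketch below reproduces this arithmetic and compares it with a synthetic, negatively correlated pair of series. It is an illustration only: the synthetic data, the use of fixed rather than seasonally varying percentiles, and the function names are assumptions.

```python
import numpy as np

P_MHW = 0.10         # share of days above the 90th-percentile temperature threshold
P_NPPX = 0.10        # share of days below the 10th-percentile NPP threshold
DAYS_PER_YEAR = 365

def independent_compound_days(p_a=P_MHW, p_b=P_NPPX, days=DAYS_PER_YEAR):
    """Expected compound days per year if the two extremes were independent."""
    return p_a * p_b * days

def compound_fraction(sst, npp):
    """Observed share of days that are both hot and low-NPP.

    Simplification: fixed percentiles over the whole series, not the
    seasonally varying thresholds described in the abstract.
    """
    sst = np.asarray(sst, dtype=float)
    npp = np.asarray(npp, dtype=float)
    mhw = sst > np.percentile(sst, 90)
    nppx = npp < np.percentile(npp, 10)
    return float(np.mean(mhw & nppx))

# Synthetic anomalies (assumed): NPP is negatively correlated with temperature.
rng = np.random.default_rng(0)
sst = rng.normal(size=20_000)
npp = -0.8 * sst + 0.6 * rng.normal(size=20_000)

baseline_days = independent_compound_days()  # 3.65 days per year
observed_days = compound_fraction(sst, npp) * DAYS_PER_YEAR
ratio = observed_days / baseline_days        # > 1 means more frequent than chance
```

A ratio well above 1 marks a place where the two extremes coincide more often than chance would predict, which is how the hotspots are characterised in the abstract.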